Abstract

As military operations in degraded or GPS-denied environments continue to
increase in frequency and importance, there is a growing need to determine
precise location within these environments. Furthermore, authorities are
finding a record number of tunnels along the U.S.-Mexico border; underground
tunnel characterization is therefore becoming a high priority for U.S. Homeland
Security as well. This thesis investigates the performance of a new image
registration technique based on a two-camera optical flow configuration using
phase correlation techniques. These techniques differ from other image-based
navigation methods but present a viable alternative that increases autonomy and
addresses the tunnel-based navigation problem. This research presents an
optical flow-based image registration technique and validates its use with
experimental results on a mobile vehicle. Results reveal that further extension
to pose estimation yields accuracy within 1.3 cm.
Optical Flow-Based Odometry for Underground Tunnel Exploration

I.  Introduction 

This thesis describes the development of a robust optical flow-based image
registration technique for navigation aiding in Global Positioning System (GPS)
denied environments. This research is motivated by the requirement for
precision navigation in uncharacterized tunnel environments.

Authorities are finding a record number of tunnels along the U.S.-Mexico border,
which is alarming for U.S. security as smugglers are becoming more inventive in
their drug and human smuggling tactics. Since the Secure Fence Act was passed in
2006, tunnel discoveries have skyrocketed, as shown in Figure 1.1 [18].

Determining the origin and destination of unknown tunnels is gaining increasing
importance. GPS is viable only when there is an unobstructed view of the sky to
receive the satellite signals, making it unavailable for underground navigation.
Robotic platforms used for reconnaissance in the unknown tunnel environment will
not only be a valuable asset in GPS-denied areas, but a necessity for future
operations.

1.1 Current Technology

Researchers have implemented many navigation techniques to solve the robotic
navigation problem [3]. However, most of these methods do not extend to the
underground navigation problem [13]. GPS navigation is a proven method for
autonomous navigation only when an unobstructed view of the sky is available, so
it is not a viable option underground. Signals of opportunity (SoOP) navigation
is another technique that exploits communication signals already in existence;
however, like GPS navigation, SoOP requires an unobstructed view of transmission
sources, which is not available underground. Dead reckoning-based odometry is an
appealing technique due to its cost and simplicity, but it is not reliable
because of error accumulation that would be magnified by underground terrain
[11]. Since the underground tunnel environment has not been previously
characterized, navigation based on a priori maps is not possible. Laser-based
navigation provides precise navigation solutions, but the cost of the system
makes it impractical for underground navigation.

Figure 1.1: Surge in Tunnel Excavations [18]

Vision-aided navigation on autonomous mobile vehicles dates back to the late
1970s, when Moravec and Gennery used feature tracking for visual correction
[15]. Visual odometry has also been used successfully with autonomous aircraft
and underwater vehicles [2] [26]. Despite these advances, the adoption of visual
techniques on robots, and quality research in that application area, has been
slow. Optical flow has been well researched in theory, but its practical
applications have yet to be explored in detail [5].

A possible solution to navigating in GPS-denied environments is optical flow
odometry based on image registration, which can be used to constrain inertial
sensor drift. Such an image-aided navigation system was developed by Lockheed
Martin in the 1960s for military purposes and has continued to evolve to
subpixel registration today [19] [35]. Although there have been significant
advances in optical flow odometry, there is a lack of research on its
application to navigation.

1.2  Scope 

This effort develops and demonstrates a navigation technique that augments
stand-alone inertial sensors with optical flow odometry to improve tracking of
system position and heading. This research will determine whether an optical
flow-based image registration technique is a viable solution for accurate pose
estimation. The scope of this thesis is limited to the development of a new
registration algorithm that registers images varying in translation only.

1.3 Organization

This thesis is organized in the following manner. A mathematical overview and
background material in the area of robot navigation are presented in Chapter II.
The methodology used to develop the optical flow-based image registration
technique is covered in depth in Chapter III, with detailed results and analysis
in Chapter IV. Recommendations for future work and final remarks conclude this
thesis in Chapter V.
II. Background

This chapter presents the mathematical and technical information necessary to
fully develop an optical flow odometry navigation solution. The chapter begins
by defining the standard mathematical notation used throughout this document.
Next, the navigation reference frames are defined, along with a mathematical
technique for transforming between coordinate reference frames. A basic
understanding of inertial navigation follows. Finally, a review of image
registration techniques is presented, followed by current GPS-denied navigation
applications.

2.1  Mathematical  Background 

The  mathematical  notation  used  throughout  this  thesis  will  be  presented  in  this 
section. 

•  Scalars: Scalars are represented with lower or upper case italic font (e.g.,
$x$ or $X$)

•  Vectors: Vectors are represented by lower case font with a vector bar over
the variable (e.g., $\bar{x}$)

•  Matrices: Matrices are denoted by upper case bold font (e.g., $\mathbf{X}$)

•  Direction Cosine Matrices: Direction cosine matrices from frame a to frame b
are denoted by $\mathbf{C}_a^b$

•  Reference Frame: If a vector is expressed in a specific reference frame, a
superscript letter is used to designate the reference frame (e.g., $\bar{x}^n$
is the vector $\bar{x}$ in the n-frame)

2.1.1 Coordinate Systems and Transformations. Navigation is an ancient skill or
art which has become a complex science. It is essentially about finding the way
from one place to another, and there are a variety of means by which this can be
achieved [10]. Navigation is performed in relation to a frame of reference
(e.g., following directions on a map based on local surroundings). Navigation
systems have multiple applications but are often developed to enable a vehicle
to correctly determine a navigation solution. Coordinate systems and
transformations provide the foundation for defining the position and orientation
of the navigation state relative to the applicable environment.

2.1.1.1 Reference Frames. Position can be described in terms of a variety of
coordinate systems called reference frames, each from a different perspective.
For instance, describing movement with respect to the earth requires a different
frame of reference than movement on a Cartesian plane. The foundation of
navigation is knowledge of position and orientation relative to a specific
reference frame. Understanding reference frames allows changes in position and
orientation of a vehicle to be quantified. The following three-axis,
right-handed reference frames are used in this research [38] [40].

•  Inertial frame (I-frame): The true inertial frame is a theoretical reference
frame defined by a non-accelerating, non-rotating, three-axis, right-handed
coordinate system with no predefined origin or orientation where Newton's laws
of motion apply.

•  Earth-Centered frame (i-frame): The Earth-Centered frame is an orthonormal
three-axis, right-handed coordinate system with origin at the center of mass of
the Earth. The x and y axes are located on the equatorial plane, aligned with
the fixed stars. The i-frame is a non-rotating frame but does accelerate with
respect to the I-frame.

•  Earth-Centered Earth-Fixed frame (e-frame): The Earth-Centered Earth-Fixed
frame is a three-axis, right-handed coordinate system with the origin at the
Earth's center of mass. The e-frame is rigidly attached to the Earth, with the
z-axis aligned with the north pole, the x-axis passing through the intersection
of the Greenwich meridian and the equator, and the y-axis lying in the
equatorial plane pointing toward 90 degrees east longitude. Figure 2.1 presents
the e-frame.
Figure 2.1: Earth-Centered Earth-Fixed Frame.


•  Local-Level Navigation frame (n-frame): The Earth-fixed Local-Level
Navigation frame is a three-axis, right-handed coordinate system with origin
located at a point defined on the vehicle body. The navigation frame axes
($x_n$, $y_n$, $z_n$) point in the north, east, and down (NED) directions,
respectively. The down direction is defined to align with the local gravity
vector. Figure 2.2 presents the n-frame.

•  Body frame (b-frame): The body frame is a three-axis, right-handed coordinate
system rigidly attached to the vehicle body with origin co-located with that of
the navigation frame. Viewing the body from above, the $x_b$, $y_b$, and $z_b$
axes point out the front, right, and bottom of the body, respectively.

•  Sensor frame (s-frame): The sensor frame is a three-axis, right-handed
coordinate system. The origin of an individual s-frame is defined relative to
each individual sensor. Within this research, the origin of each s-frame is
co-located with that of the b-frame, but each sensor has a unique x, y, and z
orientation.

2.1.2 Coordinate Transformations. Coordinate transformations are used to resolve
measurements from one reference frame into another. Describing the relationship
between two reference frames is fundamental to the process of inertial
navigation [38]. The two coordinate transformations pertinent to this research
are direction cosine matrices (DCM) and Euler angles. A DCM is a 3 x 3 matrix
used to describe a vector in a different coordinate frame according to

$$\bar{r}^b = \mathbf{C}_a^b \bar{r}^a \tag{2.1}$$

where $\bar{r}^a$ is a vector expressed in an arbitrary reference frame a,
$\bar{r}^b$ is the same vector described in the b-frame, and $\mathbf{C}_a^b$ is
the DCM that describes the three-dimensional rotation between the a-frame and
the b-frame. The element in the i-th row and the j-th column of $\mathbf{C}_a^b$
represents the cosine of the angle between the i-th axis of the b-frame and the
j-th axis of the a-frame [38].

Figure 2.2: Earth-fixed Local-Level Navigation Frame.

Euler angles describe the relationship between two reference frames and are used
to build the DCM. A series of three rotations about three different axes will
transfer a system from one coordinate frame to another. Rotations of $\psi$
about the z-axis, $\theta$ about the y-axis, and $\phi$ about the x-axis can be
described by three separate DCMs
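$$\mathbf{C}_1 = \begin{bmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{2.2}$$

$$\mathbf{C}_2 = \begin{bmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{bmatrix} \tag{2.3}$$

$$\mathbf{C}_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & \sin\phi \\ 0 & -\sin\phi & \cos\phi \end{bmatrix} \tag{2.4}$$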

When performing a transformation from the navigation frame (n-frame) to the body
frame (b-frame), the DCM may be expressed as the product of the three transforms
as follows:

$$\mathbf{C}_n^b = \mathbf{C}_3 \mathbf{C}_2 \mathbf{C}_1 \tag{2.5}$$

Using the DCM, a vector $\bar{p}^n$ in the navigation frame can be transformed
to the body frame by

$$\bar{p}^b = \mathbf{C}_n^b \bar{p}^n \tag{2.6}$$

The inverse transformation from the body frame to the navigation frame is given
by:

$$\mathbf{C}_b^n = (\mathbf{C}_n^b)^T = \mathbf{C}_1^T \mathbf{C}_2^T \mathbf{C}_3^T \tag{2.7}$$
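The composition of the three rotations can be sketched numerically. The following is a minimal illustration, assuming NumPy and the standard right-handed rotation matrices about the z, y, and x axes; the function name `dcm_nav_to_body` is hypothetical and is not taken from the thesis.

```python
import numpy as np

def dcm_nav_to_body(psi, theta, phi):
    """Build the n-frame to b-frame DCM as C3 @ C2 @ C1 (angles in radians).

    psi, theta, phi are rotations about the z, y, and x axes, respectively.
    """
    c, s = np.cos, np.sin
    c1 = np.array([[c(psi), s(psi), 0.0],
                   [-s(psi), c(psi), 0.0],
                   [0.0, 0.0, 1.0]])
    c2 = np.array([[c(theta), 0.0, -s(theta)],
                   [0.0, 1.0, 0.0],
                   [s(theta), 0.0, c(theta)]])
    c3 = np.array([[1.0, 0.0, 0.0],
                   [0.0, c(phi), s(phi)],
                   [0.0, -s(phi), c(phi)]])
    return c3 @ c2 @ c1

def body_to_nav(c_nb):
    """The inverse transformation is the transpose, since a DCM is orthonormal."""
    return c_nb.T
```

For example, with a heading of 90 degrees (vehicle pointing east) and zero pitch and roll, a north-pointing vector in the n-frame maps to the negative y-axis of the body, i.e. to the vehicle's left.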
2.2 Image Registration

Image registration is the process of geometrically aligning two images of the
same scene that were taken from different viewpoints or by different sensors.
Image registration methods can be divided into the following categories:
correlation methods, which use image pixel values directly [4]; fast Fourier
transform (FFT)-based methods, which use the frequency domain [23];
feature-based methods, which use low-level features such as edges and corners
[11]; and graph-theoretical methods, which use high-level features such as
specific objects or relationships between features [11].

The registration method used for this research utilizes the Fourier domain
approach to match images that differ in translation. The Fourier method differs
by conducting all computations in the frequency domain, as opposed to the
spatial domain [11], using the phase correlation technique [23]. Performing
calculations in the frequency domain provides excellent robustness against
correlated and frequency-dependent noise.

2.2.1 Digital Image Processing. Digital image processing is the use of computer
algorithms to process digital images. Modern digital technology makes digital
processing the cheapest and most versatile method of image processing. Digital
image processing allows for a much wider range of algorithms than analog
processing, making it more efficient at simple tasks and the only practical
solution for tasks such as feature extraction, pattern recognition, and image
registration [24].

2.2.2 Cross-Correlation. One approach to image registration maximizes the
correlation between two images using the normalized cross-correlation, which is
calculated directly in the spatial domain. Normalized cross-correlation is
typically used in order to make the similarity measure independent of uniform
changes in the image intensities [41].

Given two images $I_1$ and $I_2$, the normalized cross-correlation, $c$, of
$I_1$ and $I_2$ at a relative shift $(\Delta x, \Delta y)$ is

$$c = \frac{\sum_{x=1}^{n}\sum_{y=1}^{m} \left(I_1(x,y) - \mu_1\right)\left(I_2(x - \Delta x, y - \Delta y) - \mu_2\right)}{\sqrt{\sum_{x=1}^{n}\sum_{y=1}^{m} \left(I_1(x,y) - \mu_1\right)^2 \, \sum_{x=1}^{n}\sum_{y=1}^{m} \left(I_2(x - \Delta x, y - \Delta y) - \mu_2\right)^2}} \tag{2.8}$$

where $\mu_1$ and $\mu_2$ are the mean values (average pixel value over the
entire image) of $I_1$ and $I_2$, respectively. For an image $I_r$ that is m
pixels tall and n pixels wide, $\mu_r$ is calculated as

$$\mu_r = \frac{1}{mn} \sum_{i,j=1}^{n,m} I_r(i,j) \tag{2.9}$$

The cross-correlation is evaluated for each candidate shift between $I_1$ and
$I_2$, and the 2D coordinate position of the maximum of the cross-correlation
represents the relative displacement between the images.
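The exhaustive search described above can be sketched as follows. This is an illustrative sketch only, assuming NumPy, a small search window `max_shift`, and circular shifts via `np.roll` to keep both images the same size; the function names are hypothetical and the thesis does not specify how image borders were handled.

```python
import numpy as np

def ncc(i1, i2_shifted):
    """Normalized cross-correlation of two equally sized images."""
    a = i1 - i1.mean()
    b = i2_shifted - i2_shifted.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def best_shift(i1, i2, max_shift):
    """Evaluate every shift in [-max_shift, max_shift]^2 and keep the best.

    np.roll(i2, (dy, dx)) yields out[y, x] = i2[y - dy, x - dx], i.e. the
    term I2(x - dx, y - dy) in the correlation sum (with wraparound).
    """
    best_score, best = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = ncc(i1, np.roll(i2, (dy, dx), axis=(0, 1)))
            if score > best_score:
                best_score, best = score, (dx, dy)
    return best, best_score
```

The nested loops make the cost explicit: every candidate displacement requires a full pass over the image, which is what motivates the frequency-domain approach.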

The cross-correlation technique in the spatial domain is computationally
expensive because every candidate displacement must be evaluated; for an image
of N pixels, an exhaustive search costs on the order of N^2 operations. The
method is adequate for small amounts of data, but processing full images with it
is unrealistic due to the computational cost. The phase correlation method,
which uses the product of the Fourier transform of one image and the complex
conjugate of the Fourier transform of the other image, drastically decreases the
computational cost because the cross-correlation is done in the frequency domain
instead of the spatial domain, where the fast Fourier transform requires on the
order of N log N operations.


2.2.3 Phase Correlation. The Phase Correlation Method (PCM) is a well-known
image registration method that was first proposed by Kuglin and Hines in 1975
[23]. PCM exploits the Fourier shift theorem to estimate translational
misalignment. This method is robust to frequency-dependent noise and to
dissimilar images, e.g., varying lighting or atmospheric conditions.

Consider two images $I_1$ and $I_2$ of the same scene that differ only by a
displacement $(\Delta m, \Delta n)$. See Figure 2.3.

Figure 2.3: Image Shift: $I_2$ is a Shifted Copy of $I_1$.

The corresponding Fourier transforms $F_1$ and $F_2$ are related by

$$F_2(\omega_m, \omega_n) = F_1(\omega_m, \omega_n)\, e^{-j(\omega_m \Delta m + \omega_n \Delta n)} \tag{2.10}$$

Equation 2.10 shows that the two images have the same Fourier magnitude but a
phase difference directly corresponding to the displacement
$(\Delta m, \Delta n)$.
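The shift relationship can be checked numerically for the discrete case. The sketch below assumes NumPy, an integer displacement, and a circular shift (which is what the discrete Fourier transform models); `shifted_spectrum` is a hypothetical helper name.

```python
import numpy as np

def shifted_spectrum(f1, dm, dn):
    """Apply the shift-theorem phase ramp to a 2D spectrum.

    f1 is the 2D DFT of an image; (dm, dn) is the displacement in rows and
    columns. Angular frequencies are taken in radians per sample.
    """
    rows, cols = f1.shape
    wm = 2.0 * np.pi * np.fft.fftfreq(rows)[:, None]
    wn = 2.0 * np.pi * np.fft.fftfreq(cols)[None, :]
    return f1 * np.exp(-1j * (wm * dm + wn * dn))
```

For an image `i1` and its circularly shifted copy `i2 = np.roll(i1, (dm, dn), axis=(0, 1))`, the spectrum `np.fft.fft2(i2)` should agree with `shifted_spectrum(np.fft.fft2(i1), dm, dn)` to within floating-point error, and the two spectra have identical magnitudes.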

Calculate the normalized phase correlation by taking the complex conjugate of
the Fourier transform of the second image, multiplying the Fourier transforms
together element-wise, and normalizing the product element-wise

$$R = \frac{F_1(\omega_x, \omega_y)\, F_2^*(\omega_x, \omega_y)}{\left|F_1(\omega_x, \omega_y)\, F_2^*(\omega_x, \omega_y)\right|} = e^{j(\omega_x \Delta x + \omega_y \Delta y)} \tag{2.11}$$

Obtain the normalized cross-correlation by applying the inverse Fourier
transform to $R$

$$r = \mathcal{F}^{-1}(R) \tag{2.12}$$

Determine the location of the shift between $I_1$ and $I_2$ by

$$(\Delta x, \Delta y) = \arg\max_{(x,y)} \{r\} \tag{2.13}$$
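The three steps above can be sketched in a few lines. This is a minimal sketch, assuming NumPy and integer, circular shifts; it is not the thesis implementation. Two details are assumptions of the sketch: a small constant guards against division by zero, and, because NumPy's forward transform uses a negative exponent, the peak of the inverse transform of $F_1 F_2^*$ lands at the negative of the displacement, so the peak index is unwrapped and negated.

```python
import numpy as np

def phase_correlation(i1, i2):
    """Estimate the (row, column) displacement of i2 relative to i1."""
    f1 = np.fft.fft2(i1)
    f2 = np.fft.fft2(i2)
    cross = f1 * np.conj(f2)
    # Normalize element-wise so only the phase difference remains.
    r_freq = cross / np.maximum(np.abs(cross), 1e-12)
    r = np.fft.ifft2(r_freq).real
    peak = np.unravel_index(np.argmax(r), r.shape)
    shift = []
    for p, n in zip(peak, r.shape):
        signed = p if p <= n // 2 else p - n  # undo the periodic wrap
        shift.append(-int(signed))            # sign convention, see lead-in
    return tuple(shift), r
```

The returned surface `r` is the correlation plane; for a pure circular shift it is close to a single impulse, which is why the peak is so well localized.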
Figure 2.4: Periodic Extensions of the Original Image Due to Fourier Methods of
Cross-Correlation.


The  discrete  convolution  theorem  presumes  that  the  signal  is  periodic.  Real 
data  often  consists  of  one  non-periodic  stretch  of  data  with  finite  length  [33].  Figure 
2.4  illustrates  the  effects  of  the  assumption  that  the  signal,  an  image  in  this  case, 
is  periodic  and  shows  the  repetition  of  the  image.  This  interference  is  called  the 
wraparound  effect  [30]. 

To  maintain  the  validity  of  the  phase  correlation  method,  it  is  necessary  to  find 
a  workaround  for  the  wraparound  constraint.  Since  the  convolution  theorem  assumes 
that  the  data  is  periodic  and  pollutes  the  image  with  extraneous  data,  a  buffer  zone 
needs  to  be  added  to  the  image  to  protect  the  data  from  these  effects.  Adding  a  buffer 
zone  of  zeros,  called  zero  padding,  will  not  introduce  data  that  will  interfere  with  the 
calculation  of  the  displacement  but  will  prevent  the  wraparound  effect.  Figure  2.5 
illustrates  the  zero-padded  original  image. 
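A zero-padding step of this kind can be sketched as follows. The sketch assumes NumPy and one common convention, doubling each dimension and placing the image in the top-left corner; the exact buffer size and placement used in Figure 2.5 are not stated here, so both are assumptions.

```python
import numpy as np

def zero_pad(image):
    """Embed the image in a zero array twice as large in each dimension.

    With at least as many zeros as image samples along each axis, shifted
    copies produced by the periodic extension no longer overlap real data.
    """
    rows, cols = image.shape
    padded = np.zeros((2 * rows, 2 * cols), dtype=image.dtype)
    padded[:rows, :cols] = image
    return padded
```

Both images are padded in the same way before their Fourier transforms are taken, at the cost of transforms four times larger.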

Figure 2.5: Zero-Padded Image To Prevent the Wraparound Effect.

The Phase Correlation method offers several remarkable properties: immunity to
uniform variations of illumination, insensitivity to changes in spectral energy
and, most importantly, excellent peak localization accuracy [39]. To establish a
measure of correctness, the curvature of the cross-correlation between the
images is examined. The curvature points in the direction of maximum rate of
change and allows the algorithm to differentiate between an actual shift and an
erroneous shift. It is assumed that the slope is small compared with unity, so
the curvature of the cross-correlation, $g(x,y)$, can be approximated by taking
the second derivative of the cross-correlation, $r(x,y)$ (Equation 2.14) [17].

$$g(x,y) = \left[\nabla^2 r(x,y)\right] \tag{2.14}$$

The use of the curvature overcomes the difficulty of resolving the correct
displacement and maintains the computational simplicity of the frequency-based
cross-correlation method.

Evaluation of the correctness of the cross-correlation peak can be assessed
based on how far the curvature of the cross-correlation peak is above the noise
floor using Equation 2.16. Dividing the maximum curvature value,
$\max(g(x,y))$, by the mean of the curvature, $\mu_g$,

$$\mu_g = \frac{1}{mn} \sum_{x,y=1}^{n,m} g(x,y) \tag{2.15}$$

provides a metric ($A$) to assess the correctness of the displacement calculated
by

$$A = \frac{\max(g(x,y))}{\mu_g} \tag{2.16}$$

Figure 2.6: Cross-Correlation Plot Based on Fourier Methods: Difficult to
Automatically Resolve Displacement Seen Highlighted in Black Circle.
It can be seen in Figure 2.6 that the displacement corresponding to the
cross-correlation peak of two images, highlighted in the black circle, cannot be
resolved by evaluating the maximum value of the cross-correlation, because that
peak is not the global maximum. The curvature of the cross-correlation, shown in
Figure 2.7, resolves the displacement very clearly by evaluating the maximum
rate of change of the cross-correlation.
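The curvature test and the peak-to-mean metric can be sketched as below. Several choices here are assumptions rather than details from the thesis: the second derivative is approximated by a five-point finite-difference Laplacian, boundaries are treated as periodic through `np.roll`, and the magnitude of the Laplacian is used so that a sharp correlation maximum (whose Laplacian is negative) yields a large positive curvature value.

```python
import numpy as np

def curvature(r):
    """Magnitude of the five-point discrete Laplacian of a correlation surface."""
    lap = (np.roll(r, 1, axis=0) + np.roll(r, -1, axis=0)
           + np.roll(r, 1, axis=1) + np.roll(r, -1, axis=1)
           - 4.0 * r)
    return np.abs(lap)

def peak_metric(r):
    """Return (max curvature / mean curvature, location of max curvature)."""
    g = curvature(r)
    peak = np.unravel_index(np.argmax(g), g.shape)
    return float(g.max() / g.mean()), (int(peak[0]), int(peak[1]))
```

A narrow spike produces a large ratio even when a broad, smooth hump elsewhere on the surface has a higher raw correlation value, which is the situation the curvature test is meant to handle.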


2.3 Non-GPS Navigation Techniques

Over the past couple of decades, the ability to navigate in all environments has
gained increasing importance. Precision navigation in GPS-denied environments is
becoming a necessity in modern operations. GPS navigation requires receiver
line-of-sight to the satellites; thus, as missions continue to move underground,
it is not a viable solution. The problems in GPS availability necessitate the
development of non-GPS dependent navigation techniques.

Figure 2.7: Curvature of the Cross-Correlation Plot: Displacement Easily
Resolved With the Curvature of the Cross-Correlation.

Non-GPS dependent navigation techniques increase mobility and control by not
limiting reach to areas where line-of-sight to satellites is available, and they
provide navigation redundancy where GPS is denied. They provide the capability
to navigate and characterize unknown environments in areas where GPS cannot
reach. Non-GPS navigation can be used to guide robotic platforms into regions of
the battlefield where GPS is unavailable. It can also give U.S. and allied
forces the ability to conduct timely reconnaissance, surveillance, and target
acquisition in unfamiliar urban environments, a capability currently being
investigated by the Dragon Runner project [37].

Recent literature has demonstrated numerous methods of precision navigation
without GPS, from signals of opportunity (SoOP) to vision-aided approaches. The
remaining sections outline several approaches for non-GPS navigation and
highlight their advantages and disadvantages for underground robot navigation.
2.3.1 Dead Reckoning. A common method of calculating odometry is based on dead
reckoning. Dead reckoning is the process of estimating current position based on
a previously determined position. Early mobile robots used dead reckoning to
estimate a robot's position and orientation with respect to a local reference
frame placed in the environment. This is an attractive means of keeping track of
position because it is inexpensive and can be done in real time. Dead reckoning
is usually accomplished with wheel, axle, or gear encoders by counting the
number of rotations and converting that count into ground distance traveled. The
primary shortfall of dead reckoning based on odometry measurements is that each
rotation does not necessarily correspond to ground distance traveled [9] [8]. A
main source of this error is the skid-steering design of mobile robots. A
skid-steering mobile robot has differentially driven wheel pairs on each side of
the robot but lacks a steering device; this design introduces uniform wheel
movement that does not necessarily correspond to ground distance traveled [21].
Other sources of error include gravel or muddy terrain, where axle rotations do
not directly relate to motion. Furthermore, the position estimates are based on
previously calculated positions, so the errors of the process are cumulative and
unbounded [38].
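The encoder-based update described above can be illustrated with a short sketch. It assumes a simple differential-drive kinematic model with no wheel slip; the function name and all parameters (`ticks_per_rev`, `wheel_radius`, `track_width`) are hypothetical and do not describe the vehicle used in this research.

```python
import math

def dead_reckon(pose, left_ticks, right_ticks,
                ticks_per_rev, wheel_radius, track_width):
    """Advance an (x, y, heading) pose from one pair of encoder counts.

    Distances are in the units of wheel_radius; heading is in radians.
    """
    x, y, heading = pose
    distance_per_tick = 2.0 * math.pi * wheel_radius / ticks_per_rev
    d_left = left_ticks * distance_per_tick
    d_right = right_ticks * distance_per_tick
    d_center = 0.5 * (d_left + d_right)
    d_heading = (d_right - d_left) / track_width
    # Integrate along the mid-step heading; each new pose builds on the last,
    # so any error in the tick-to-distance conversion accumulates.
    x += d_center * math.cos(heading + 0.5 * d_heading)
    y += d_center * math.sin(heading + 0.5 * d_heading)
    return (x, y, heading + d_heading)
```

When a wheel slips, the tick counts still advance while the vehicle does not, so the computed `d_center` overstates the true motion and every later pose inherits that error.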

2.3.2 Navigation Based on A Priori Maps. Another navigation technique relies on
a priori maps, which contain predefined details about the environment. For this
application, navigation is conducted by recognizing and matching features to
localize the current pose in the map [27]. This navigation technique is not a
viable solution for the underground environments discussed in this thesis,
because no previous knowledge of the environment is assumed.

2.3.3 Laser-Based Navigation. A wide range of laser-based sensors are currently
being applied to the GPS-denied navigation problem [22] [27] [14] [6]. Laser
sensors provide fine angular and distance resolution and real-time behavior,
with hundreds of point measurements per second and low false positive rates.
Efficient algorithms exist for mapping and localization using this technique.
However, image-based sensors are often preferred due to laser technology costs,
with typical laser sensors costing an order of magnitude more [22].

2.3.4 Signals of Opportunity Navigation. Signals of opportunity-based navigation
estimates position using multilateration techniques similar to GPS. Pseudoranges
are created using Time Difference of Arrival (TDOA) measurements between a
reference receiver and a mobile receiver, allowing the receiver to obtain a
position estimate over time. As with GPS, SoOP navigation requires the use of
transmission sources, such as Amplitude Modulation (AM) signals, that are not
available in an underground setting [29].

2.3.5  Visual  Based  Navigation.  Visual  based  navigation  techniques  are 
typically  classified  as  either  feature-based  or  optic  flow-based  [3]  [16].  Feature-based 
methods  determine  the  correspondence  of  features  in  a  scene  over  multiple  frames  and 
optical  flow-based  techniques  determine  correspondence  for  a  whole  image  between 
frames. 

Feature tracking-based navigation methods have been proposed for both
fixed-mount imaging sensors and gimbal-mounted detectors. Many feature
tracking-based navigation methods exploit knowledge of the target location and
solve the inverse trajectory projection problem [1]. Feature-based navigation
techniques such as simultaneous localization and mapping (SLAM) determine
changes in pose by matching key points between images. This technique is popular
because it is robust to changes in scale, illumination, and rotation. The major
drawback of implementing this technique in underground environments is the lack
of features to detect and track [25].

Optic flow methods have generally been proposed for elementary motion detection,
focusing on determining relative velocity, angular rates, or obstacle avoidance
[20]. Optical flow-based odometry is useful for a variety of reasons; for
example, cameras are fairly compact and inexpensive and can be mounted on very
small robots. Also, with the rapid decreases in the cost and power requirements
of computation, the expense of using visual odometry is often small compared to
other methods [36]. Even though existing visual odometry techniques based on
geometric inference are highly accurate and work well in a variety of
environments, this research explores visual odometry methods that determine pose
changes without explicitly determining the scene or camera geometry. Geometric
visual odometry techniques are often difficult to execute because of the
complexity and sensitivity of camera calibration. Furthermore, the configuration
parameters are specific to each camera and lens, requiring modification for any
changes made to the system.

2.4  Summary 

This chapter presented the mathematical and technical information necessary to
fully develop an optical flow odometry navigation solution. The mathematical
notation used throughout this document was defined first. The navigation
reference frames were then defined, along with a mathematical technique for
transforming between coordinate reference frames. Finally, image registration
techniques were reviewed, followed by current GPS-denied navigation
applications.

The  methodology  used  to  develop  an  optical  flow  based  registration  algorithm 
will  be  covered  in  depth  in  Chapter  III  with  detailed  results  and  analysis  in  Chapter 
IV.  Recommendations  for  future  work  and  final  remarks  will  conclude  this  thesis  in 
Chapter  V. 
III. Methodology

The purpose of the optical flow-based registration algorithm is to constrain
drift present in an inertial sensor when GPS is not available. The development
of a registration algorithm optimized for phase correlation between images as a
vehicle moves through an underground environment is fundamentally an estimation
problem involving mapping from an image scene at time (t) to an image scene at
time (t + 1). This chapter outlines the optical flow algorithm used for pose
estimation. The methods presented in this thesis for estimating vehicle pose
using optical flow-based image registration techniques follow the basic steps
shown in Figure 3.1.

3.1  Theory  and  Algorithms 

3.1.1 Phase One: Single Image Characterization. Algorithm testing began
with one large image, taking a series of known crops of the image and verifying that the
correct shifts were calculated. To better explain this technique, consider the following
example. First, a 4111 × 6300 pixel image, shown in Figure 3.2, is cropped into two
1024 × 768 pixel images, shown in Figures 3.3 and 3.4. Figure 3.3 is the reference
image, denoted by the solid rectangle in Figure 3.2, taken at time-step t. Figure 3.4
is considered the translated image taken at time-step t + 1, denoted by the dashed
rectangle in Figure 3.2. In this example the difference between the reference image
and the translated image is a shift of 0 pixels in the Y-direction and 200 pixels in the
X-direction applied to the translated image.

Figure 3.1: Optical Flow Based Registration Algorithm.

Using the image registration techniques discussed in Chapter II, the Fourier transform
of each cropped image is taken, and the phase correlation is assessed by multiplying
the transform of the first cropped image by the complex conjugate of the transform of
the second cropped image. The inverse Fourier transform of the result is then taken,
and the location of the maximum cross-correlation gives the displacement between the
two images. Figure 3.5 shows the cross-correlation between the two images; the peak
represents the maximum correlation. The wraparound effect from the Fourier domain
calculations can be seen in the peaks at shift locations [200 0] and [768 0].
Effectively, it is unknown whether the actual shift was [200 0] or [200 − N 0].
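
The wraparound ambiguity can be made concrete with a short Python sketch. The helper names, and the rule of preferring the smaller-magnitude candidate, are assumptions introduced here for illustration and are not taken from the thesis implementation; N is assumed to be the image dimension along the shift axis.

```python
def candidate_shifts(peak_index, size):
    """Return both displacements consistent with a correlation peak.

    A peak at `peak_index` along an axis of length `size` (N in the text)
    may be a forward shift of `peak_index` pixels or a backward shift of
    `peak_index - size` pixels, because the FFT treats the image as periodic.
    """
    return peak_index, peak_index - size


def signed_shift(peak_index, size):
    """Pick the smaller-magnitude candidate (assumes |shift| <= size / 2)."""
    forward, backward = candidate_shifts(peak_index, size)
    return forward if forward <= size // 2 else backward


if __name__ == "__main__":
    width = 1024  # crop width used in the single image example
    print(candidate_shifts(200, width))      # forward and wrapped candidates
    print(signed_shift(200, width))          # forward motion
    print(signed_shift(width - 200, width))  # same magnitude, opposite direction
```

If the vehicle is known to move in one direction only, that prior could replace the smaller-magnitude rule used in `signed_shift`.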

Figure 3.2: Single Large Image.

Curvature of the cross-correlation provides a measure to assess the correctness
of the predicted shift by identifying how far the maximum rate of change of the
cross-correlation lies above its average curvature. The displacement between image a
and image b in Figure 3.6 can easily be inferred by eye using the red highlighted
circles. In the cross-correlation between the images, shown in Figure 3.7, however,
the displacement peak, highlighted by the black circle, is overshadowed by noise, so
the true displacement cannot be resolved automatically from the maximum of the
cross-correlation. Figure 3.7 contains multiple peaks with a greater value than the
actual displacement peak, but the actual displacement peak is distinguished by its
sharpness. Assessing the curvature of the cross-correlation plot resolves which peak
has the sharpest change in slope. Figure 3.8 shows that the curvature suppresses the
noise and leaves a single peak at the actual displacement. This method can also be
used to detect when a displacement cannot be determined between two images. Figure
3.9 shows a cross-correlation plot where no sharp peaks are present, indicating that
an accurate displacement cannot be determined. The curvature of this
cross-correlation plot, shown in Figure 3.10, indicates that the maximum value found
from the cross-correlation is polluted with noise, and therefore that a correlation
between these images could not be determined. The maximum of the curvature of the
cross-correlation can be evaluated automatically using Equation 2.15, where α is a
metric for the validity of the predicted displacement.
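
The idea of ranking peaks by sharpness rather than height can be sketched in Python as follows. Equation 2.15 is not reproduced in this chapter, so the discrete Laplacian used as the curvature measure and the max-to-mean ratio used as a stand-in for α are both assumptions made for illustration, as are the function names.

```python
import math


def curvature(surface):
    """Magnitude of the discrete Laplacian, with periodic (wraparound) borders."""
    rows, cols = len(surface), len(surface[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            lap = (surface[(r - 1) % rows][c] + surface[(r + 1) % rows][c]
                   + surface[r][(c - 1) % cols] + surface[r][(c + 1) % cols]
                   - 4.0 * surface[r][c])
            out[r][c] = abs(lap)
    return out


def sharpness_ratio(surface):
    """Return (max curvature / mean curvature, location of the max curvature)."""
    curv = curvature(surface)
    flat = [(v, r, c) for r, row in enumerate(curv) for c, v in enumerate(row)]
    peak, r, c = max(flat)
    mean = sum(v for v, _, _ in flat) / len(flat)
    ratio = peak / mean if mean > 0 else 0.0
    return ratio, (r, c)


if __name__ == "__main__":
    n = 16
    # Smooth, high-valued background plus one lower but much sharper spike.
    surface = [[2.0 + 2.0 * math.cos(2.0 * math.pi * r / n) for _ in range(n)]
               for r in range(n)]
    surface[5][9] += 2.0
    ratio, location = sharpness_ratio(surface)
    print(location, round(ratio, 1))
```

In this synthetic surface the largest raw value lies on the smooth background, while the curvature maximum lies on the spike, which mirrors the situation described for Figures 3.7 and 3.8.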

3.1.2 Phase Two: Single Camera Characterization. The next phase of
the optical flow odometry based image registration algorithm characterization uses
a single camera. Evaluating the algorithm with real-world images introduces pixel
disparity at matching points. Pixel disparity results from slight changes in how the
real world is mapped to each pixel from one image to the next. This effect may not
be noticeable to the naked eye, but because each pixel is recorded precisely, slight
differences are detectable. Overcoming these slight changes increases the robustness
of the algorithm, making it more reliable for real-world implementation.


Figure 3.3: Image Crop 1.

A series of images is captured, as shown in Figure 3.11, and, as in the
single image setup, two sequential images are evaluated in the frequency domain,
where the maximum of the cross-correlation resolves the displacement between the
two images. Figure 3.12 shows images captured at time (t), time (t + 1), and time
(t + 2). By inspection of Figure 3.12, it is clear that the shift is in the positive
Y-direction (assuming a right-handed orthogonal reference frame).

Figure 3.4: Image Crop 2.

3.1.3 Phase Correlation. The two sequential images are converted from the
spatial domain to the frequency domain, where the phase shift is related to the relative
translation in the spatial domain. The inverse Fourier transform of the normalized
phase correlation, defined by

$$ \Gamma = \frac{F_1(\omega_x, \omega_y)\, F_2^{*}(\omega_x, \omega_y)}{\left| F_1(\omega_x, \omega_y)\, F_2^{*}(\omega_x, \omega_y) \right|} = e^{i(\omega_x \Delta x + \omega_y \Delta y)} $$  (3.1)

where F_1 and F_2 are the Fourier transforms of the two images, resolves a maximum
peak at the location of the displacement between the two images. Figure 3.13
illustrates the cross-correlation; the displacement is estimated from the overall
maximum peak.
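
A minimal NumPy sketch of normalized phase correlation is given below. It is an illustration under stated assumptions rather than the thesis implementation (which was written in MATLAB): the function name, the small `eps` guard against division by zero, the choice of which transform is conjugated (which fixes the sign convention of the recovered shift), and the random synthetic scene standing in for the large test image are all introduced here.

```python
import numpy as np


def phase_correlation(ref, moved, eps=1e-12):
    """Return the (row, col) peak of the phase-correlation surface and the surface."""
    f1 = np.fft.fft2(ref)
    f2 = np.fft.fft2(moved)
    cross_power = f1 * np.conj(f2)
    # Normalizing by the magnitude keeps only the phase term of Eq. (3.1).
    gamma = cross_power / (np.abs(cross_power) + eps)
    surface = np.real(np.fft.ifft2(gamma))
    peak = np.unravel_index(np.argmax(surface), surface.shape)
    return (int(peak[0]), int(peak[1])), surface


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scene = rng.random((200, 400))     # synthetic stand-in for the large image
    ref = scene[50:146, 100:228]       # 96 x 128 reference crop
    moved = scene[50:146, 120:248]     # same crop window moved 20 pixels along X
    peak, _ = phase_correlation(ref, moved)
    print(peak)
```

With this sign convention, moving the crop window 20 pixels along X should place the peak at row 0, column 20. A peak index past the midpoint of an axis would indicate a wrapped (negative) shift.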

Figure 3.5: Single Image Cross-Correlation.

3.1.4 Phase Three: Two Camera Characterization. From the readings of
two cameras it is possible to compute the pose of a mobile vehicle independently
of its kinematics. This section describes how to estimate the vehicle pose when
the cameras are in a linear orientation, separated by a distance D, as shown in
Figure 3.14.

Each camera measures a 2-dimensional translation in the coordinate system of the
vehicle to which it is rigidly attached, using the methods described in Section
3.1.2. If the vehicle moves along an arc of circumference, it can be shown that
each camera also moves along an arc of circumference, characterized by the same
angle but a different radius. An ambiguity arises if only a single camera is used for
odometry and heading. For example, Figure 3.15 shows two different paths that
produce identical Δx and Δy measurements. The path length, l, and the angle between
the x-axis and the velocity vector, α, are the same in both cases. In Figure 3.15 (a)
the vehicle moved in a straight line, whereas in Figure 3.15 (b) it moved in an
arc [7] [28].

Figure 3.6: Determination of Shift Example.

In general, the following relationships can be made between the camera
displacements and the path:

$$ \Delta x = l \cos(\alpha) $$  (3.2)

$$ \Delta y = l \sin(\alpha). $$  (3.3)

The angle α between the x-axis of the camera and the tangent is

$$ \alpha = \arctan\left(\frac{\Delta y}{\Delta x}\right) $$  (3.4)

The length of the arc, l, can then be resolved using

$$ l = \begin{cases} |\Delta x| & \text{for } \alpha = 0, \pi \\ \dfrac{\Delta y}{\sin(\alpha)} & \text{otherwise.} \end{cases} $$  (3.5)
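
As a worked illustration of Equations (3.2) through (3.5), the following Python sketch recovers α and l from a single camera's displacement reading. The function name is hypothetical, and `math.atan2` is used in place of a plain arctangent as an assumption so that Δx = 0 and the α = π case are handled without special treatment.

```python
import math


def path_from_displacement(dx, dy):
    """Return (alpha, l) for one camera reading (dx, dy).

    alpha follows Eq. (3.4), computed with atan2 (an assumption, see lead-in);
    l follows the piecewise definition of Eq. (3.5).
    """
    alpha = math.atan2(dy, dx)
    sin_alpha = math.sin(alpha)
    if math.isclose(sin_alpha, 0.0, abs_tol=1e-12):
        length = abs(dx)          # alpha = 0 or pi: motion purely along x
    else:
        length = dy / sin_alpha
    return alpha, length


if __name__ == "__main__":
    alpha, length = path_from_displacement(3.0, 4.0)
    # Eqs. (3.2) and (3.3) should reproduce the original reading.
    print(length * math.cos(alpha), length * math.sin(alpha))
```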

Figure 3.7: Cross-Correlation: Actual Shift is Overshadowed.

Over a short sampling period, it is assumed that the vehicle moves with constant
tangential and rotational speeds; therefore, during a sampling period the movement
can be approximated by an arc of circumference. The arc of circumference can be
described by x and y coordinates as well as the rotational angle Δθ. Each camera
resolves a Δx and a Δy, and the distance between the cameras is held constant at D.
This means that the line that joins the cameras should always produce the same
displacement value. From these readings it is possible to calculate the vehicle pose
changes in terms of Δx, Δy, and Δθ. The cosine rule is applied to the triangle formed
by the radius of each camera (r_1 and r_2), the joining line (D), and the angle
between the radii (γ), as shown in Figure 3.16. Although γ is defined by the radii of
both cameras, it can be computed as γ = |α_1 − α_2|.

$$ D^2 = r_1^2 + r_2^2 - 2 r_1 r_2 \cos(\gamma) $$  (3.6)

The ratio between the arc length (l) and the arc angle (θ) describes the radius
(r) of an arc through l = rθ. The two cameras are associated with arcs under the
same angle, which directly corresponds to the change in orientation of the vehicle.
The radii are described by

$$ r_1 = \frac{l_1}{|\Delta\theta|} $$  (3.7)

$$ r_2 = \frac{l_2}{|\Delta\theta|} $$  (3.8)

Substituting Equations (3.7) and (3.8) into Equation (3.6) results in an expression for
the change in orientation (Δθ) described by

$$ \Delta\theta = \frac{\sqrt{l_1^2 + l_2^2 - 2\cos(\gamma)\, l_1 l_2}}{D} \cdot \operatorname{sign}(y_2 - y_1). $$  (3.9)
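
The orientation change of Equation (3.9) can be sketched in Python as a direct transcription of the formulas above. The function names are hypothetical, `atan2` replaces the plain arctangent, y_1 and y_2 are assumed to be the Δy readings of camera 1 and camera 2, and the readings are assumed to have already been converted from pixels to the same metric units as D.

```python
import math


def arc_parameters(dx, dy):
    """Tangent angle (Eq. 3.4, via atan2) and arc length (Eq. 3.5) for one camera."""
    alpha = math.atan2(dy, dx)
    sin_alpha = math.sin(alpha)
    if math.isclose(sin_alpha, 0.0, abs_tol=1e-12):
        return alpha, abs(dx)
    return alpha, dy / sin_alpha


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def delta_theta(cam1, cam2, baseline):
    """Eq. (3.9): orientation change from two (dx, dy) readings and separation D."""
    (dx1, dy1), (dx2, dy2) = cam1, cam2
    alpha1, l1 = arc_parameters(dx1, dy1)
    alpha2, l2 = arc_parameters(dx2, dy2)
    gamma = abs(alpha1 - alpha2)
    radicand = l1 ** 2 + l2 ** 2 - 2.0 * math.cos(gamma) * l1 * l2
    # max() guards against a tiny negative radicand caused by rounding.
    return math.sqrt(max(radicand, 0.0)) / baseline * sign(dy2 - dy1)


if __name__ == "__main__":
    D = 0.454  # camera separation in meters, from the hardware description
    print(delta_theta((0.0, 0.04), (0.0, 0.05), D))
```

By the cosine rule, the radicand equals the squared length of the difference between the two displacement vectors, so identical readings from both cameras give Δθ = 0.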
Figure 3.8: Curvature of the Cross-Correlation.

The vehicle coordinate frame, the b-frame, is described in Figure 3.14; it is an
orthogonal frame with the two cameras rigidly attached to the vehicle body. The
cameras are in a linear orientation separated by a distance D. The formulas for
calculating the coordinates at the end of the sampling period are

$$ x_1' = r_1\left(\sin(\alpha_1 + \Delta\theta) - \sin(\alpha_1)\right) \cdot \operatorname{sign}(\Delta\theta), $$  (3.10)

$$ y_1' = r_1\left(\cos(\alpha_1) - \cos(\alpha_1 + \Delta\theta)\right) \cdot \operatorname{sign}(\Delta\theta) - \frac{D}{2}, $$  (3.11)

$$ x_2' = r_2\left(\sin(\alpha_2 + \Delta\theta) - \sin(\alpha_2)\right) \cdot \operatorname{sign}(\Delta\theta), $$  (3.12)

$$ y_2' = r_2\left(\cos(\alpha_2) - \cos(\alpha_2 + \Delta\theta)\right) \cdot \operatorname{sign}(\Delta\theta) + \frac{D}{2}. $$  (3.13)
Figure 3.9: Cross-Correlation: No Correlation Between Images Determined.

The change in vehicle position in the x and y direction can be resolved using

$$ \Delta x = \frac{x_1' + x_2'}{2}, $$  (3.14)

$$ \Delta y = \frac{y_1' + y_2'}{2}. $$  (3.15)

Figure 3.10: Curvature of the Cross-Correlation: No Correlation Between Images.

The position of the vehicle, in the b-frame, at time t + 1 can be computed by
knowing the coordinates at time t and the relative movement carried out during the
period from t to t + 1 with

$$ X_{t+1} = X_t + \sqrt{\Delta x^2 + \Delta y^2}\, \cos\left(\Theta_t + \arctan\left(\frac{\Delta y}{\Delta x}\right)\right) $$  (3.16)

$$ Y_{t+1} = Y_t + \sqrt{\Delta x^2 + \Delta y^2}\, \sin\left(\Theta_t + \arctan\left(\frac{\Delta y}{\Delta x}\right)\right) $$  (3.17)

$$ \Theta_{t+1} = \Theta_t + \Delta\theta $$  (3.18)
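
The pose update can be sketched in Python as shown below. This is a hedged illustration: the function names are hypothetical, the vehicle displacement is taken as the midpoint of the two camera end positions (an assumption consistent with the ±D/2 camera offsets), `atan2` replaces the plain arctangent, and the straight-line branch for Δθ = 0 is an added limit case that avoids dividing by zero.

```python
import math


def camera_endpoint(alpha, length, d_theta, y_offset):
    """Camera position in the b-frame after one sampling period (Eqs. 3.10-3.13)."""
    if math.isclose(d_theta, 0.0, abs_tol=1e-12):
        # Straight-line limit of the arc equations (assumption, see lead-in).
        return length * math.cos(alpha), length * math.sin(alpha) + y_offset
    radius = length / abs(d_theta)                      # Eqs. (3.7) and (3.8)
    s = math.copysign(1.0, d_theta)
    x = radius * (math.sin(alpha + d_theta) - math.sin(alpha)) * s
    y = radius * (math.cos(alpha) - math.cos(alpha + d_theta)) * s + y_offset
    return x, y


def update_pose(pose, cam1, cam2, d_theta, baseline):
    """Advance pose = (X, Y, Theta) given per-camera (alpha, l) pairs."""
    x1, y1 = camera_endpoint(cam1[0], cam1[1], d_theta, -baseline / 2.0)
    x2, y2 = camera_endpoint(cam2[0], cam2[1], d_theta, +baseline / 2.0)
    dx = (x1 + x2) / 2.0                                # midpoint assumption
    dy = (y1 + y2) / 2.0
    step = math.hypot(dx, dy)
    heading = pose[2] + math.atan2(dy, dx)
    return (pose[0] + step * math.cos(heading),         # Eq. (3.16)
            pose[1] + step * math.sin(heading),         # Eq. (3.17)
            pose[2] + d_theta)                          # Eq. (3.18)


if __name__ == "__main__":
    pose = (0.0, 0.0, 0.0)
    pose = update_pose(pose, (0.3, 1.0), (0.3, 1.0), 0.1, 0.454)
    print(pose)
```

When both cameras report the same arc, the distance moved equals the chord of that arc, 2 r sin(Δθ/2), which gives a quick sanity check on the implementation.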

This process describes how the relative pose of the vehicle is resolved by applying
the displacement outputs from each camera's cross-correlation to the algorithm
described in Section 3.1.4. No previous research has fused image registration from
two downward looking cameras for use as an odometric sensor. The hardware setup and
the results used to test this algorithm are presented in Chapter IV.

3.2  Performance  Metrics 

Figure 3.11: Optical Flow From Single Camera Image Processing.

The primary metric for evaluating the performance of the optical flow based
registration algorithm is the absolute value of the difference between the predicted
displacement (ΔX_p, ΔY_p) and the truth (ΔX_t, ΔY_t):

$$ X\text{-error} = |\Delta X_p - \Delta X_t|, $$  (3.19)

$$ Y\text{-error} = |\Delta Y_p - \Delta Y_t|. $$  (3.20)

To represent the multidimensional displacement, the Euclidean distance (l2 norm)
between the predicted (ΔX_p, ΔY_p) and the truth (ΔX_t, ΔY_t),

$$ \text{Error} = \sqrt{(\Delta X_p - \Delta X_t)^2 + (\Delta Y_p - \Delta Y_t)^2}, $$  (3.21)

provides another metric for evaluation.
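
These metrics can be computed with a few lines of Python. The sketch below uses hypothetical function names and made-up sample values; the use of the population standard deviation is an assumption, since the text does not state which form was used for the reported statistics.

```python
import math
import statistics


def displacement_errors(predicted, truth):
    """Per-sample X-error, Y-error (Eqs. 3.19, 3.20) and l2 error (Eq. 3.21)."""
    x_err, y_err, l2_err = [], [], []
    for (xp, yp), (xt, yt) in zip(predicted, truth):
        x_err.append(abs(xp - xt))
        y_err.append(abs(yp - yt))
        l2_err.append(math.hypot(xp - xt, yp - yt))
    return x_err, y_err, l2_err


def summarize(errors):
    """Mean and population standard deviation of an error series."""
    return statistics.mean(errors), statistics.pstdev(errors)


if __name__ == "__main__":
    predicted = [(3.0, 4.0), (1.0, 1.0)]   # illustrative values only
    truth = [(0.0, 0.0), (1.0, 2.0)]
    x_err, y_err, l2_err = displacement_errors(predicted, truth)
    print(summarize(l2_err))
```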
Figure 3.12: Image Captures From Single Camera Image Processing.

3.3 Summary

Chapter  III  presented  the  methodology  used  to  develop  an  optical  flow  based 
registration  algorithm.  Chapter  IV  presents  the  hardware  and  testing  parameters  used 
to  test  the  optical  flow  based  registration  algorithm  followed  by  detailed  results  and 
analysis.  Finally,  recommendations  for  future  work  and  final  remarks  will  conclude 
this  thesis  in  Chapter  V. 


Figure  3.13:  Cross-Correlation  from  Single  Camera  Image  Processing. 
Figure 3.14: Positioning of the Cameras in the Body Frame. Camera 1 and Camera
2 are Parallel to Each Other and Separated by a Distance D.


Figure 3.15: Two Different Paths That Return the Same Readings.

Figure 3.16: Triangle Used to Resolve Δθ Based on the Cosine Rule.

IV. Analysis and Results

This chapter presents the hardware and testing parameters used to test the optical
flow based registration algorithm. Simulated and real-world results are then
presented, followed by an analysis of those results.

4.1 Hardware Overview

The navigation system is shown in Figure 4.1. The navigation reference system
is highlighted in green and the image-aided navigation system is highlighted in
purple. The navigation reference system consisted of a tactical grade Inertial
Measurement Unit (IMU), a GPS receiver, and a Synchronous Position, Attitude and
Navigation (SPAN) receiver. The image-aided navigation system consisted of a
navigation computer and a pair of digital cameras.

4.1.1 Digital Cameras. Both digital cameras are PixeLINK PL-A741
machine vision cameras. The PL-A741 uses a monochrome complementary
metal oxide semiconductor (CMOS) imaging sensor with 1280 × 1024 pixel (1.3
megapixel) resolution. The pixels are 6.7 μm square, which results in a sensor size
of 8.576 mm × 6.921 mm. The cameras are paired with VS Technology Corp SV-0514MP
lenses. These lenses feature high resolution with low distortion, lock screws for
focus and iris, and a maximum sensor size of 2/3 inch [31]. Each camera communicates
with a laptop computer over an IEEE-1394a (FireWire) interface and collects images at
10 Hz. Finally, the camera includes a global shutter, which exposes all of the pixels
simultaneously. The complete PL-A741 specifications are listed in [32]. The two
cameras are mounted on a rigid bracket on the right side of the vehicle, pointed
downward (toward the ground) at 90 degrees relative to the vehicle, 14.29 cm from
the ground and spaced 45.4 cm apart. Figure 4.2 shows a diagram of the golf cart
setup.

Figure 4.1: Implementation Block Diagram.

4.1.2 Tactical Grade IMU. The Honeywell HG1700 is a tactical-grade,
strapdown IMU. The unit consists of a GG1308 ring-laser gyroscope and a triad of
RBA-500 accelerometers, which produce measurements at 100 Hz. The gyroscopes
and accelerometers have a dynamic range of ±1000 degrees per second and ±50 g,
respectively.

4.1.3 SPAN. NovAtel Inc.'s Synchronous Position, Attitude and Navigation
(SPAN) receiver was used to interface the GPS and the tactical grade IMU. The SPAN
provided the truth source for evaluating the optical flow based odometry algorithm.

4.1.4 Vehicle Platform. The coordinate frame for the system is shown in
Figure 4.2. The system is configured in a right-handed orthogonal manner, with
vehicle movement in the forward direction being positive y relative to the b-frame.
The origin of the b-frame is designed to coincide with the origin of the SPAN
s-frame, and the SPAN is rotated to align with the b-frame. The x, y, and
z coordinates follow the right hand rule. Observe that when the vehicle moves only
in the forward (positive y) direction, the result is a positive m pixel movement,
and a right hand turn results in a positive x measurement and a positive n pixel
movement.

Figure 4.2: Vehicle platform setup.


Note that the imaging system and the SPAN are both mounted on a rigid body,
each experiencing the same roll (φ), pitch (θ), and yaw (ψ) as the vehicle. The imaging
system is translated from the SPAN by the vector
p^b_camera = (−0.4853, −0.8128, −0.6890) meters.


4.2 Simulation Results

An outdoor environment was chosen to test the optical flow based algorithm
with the two camera configuration. The data collection took place during the day
and covered the path shown in Figure 4.3. Images were captured at a rate of 10 Hz
during the outdoor collection.

Figure 4.3: Vehicle platform setup.


The post-processing and optical flow algorithm were implemented in MATLAB®
R2010a and follow the steps outlined in Chapter III. To evaluate the performance
of this implementation, an alpha (α) parameter sweep was performed to determine
the optimal threshold for accurate displacement detection. The predicted path was
then compared to the truth path and evaluated with the accuracy statistics.

4.2.1 Peak Parameter Sweep. Recall from Chapter III that image registration
techniques estimate the displacement between two images from the maximum of the
cross-correlation, and that the curvature of the cross-correlation provides a
measure of the correctness of the calculated shift. The correctness of the
cross-correlation peak can be evaluated based on how far the maximum calculated
cross-correlation value lies above the average cross-correlation value. This
parameter, α, is calculated using Equation 2.15.
Figure 4.4: Alpha (α) Parameter Sweep For X-Direction Test Setup.

The Peak Parameter Sweep used the single image test setup discussed in Section
3.1.1, in which two 1024 × 768 pixel images, denoted by the solid rectangle and the
dashed rectangle in Figure 3.2, are cropped from one large image (4111 × 6300
pixels). The alpha parameter (α) was evaluated at 1000 single pixel shifts in the
positive X-direction (Figure 4.4), the positive Y-direction (Figure 4.5), and the
diagonal positive X-direction and Y-direction (Figure 4.6). The alpha parameter
is assessed in the positive direction only because it is assumed that the vehicle
will always be either moving in the positive Y-direction or not moving. Figure 4.7
shows the average alpha value per pixel shift for the test runs. The red line
represents a correct displacement prediction, the blue line represents an incorrect
displacement prediction, and the black dot is the α threshold value.

A threshold of α = 10 is set to measure the correctness of the cross-correlation.
The threshold value allows the algorithm to ignore incorrectly predicted shifts, thus
producing a more accurate prediction of the vehicle pose.
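
Applying the threshold amounts to a simple filter over the per-frame estimates. The Python sketch below is illustrative only: the function name, the sample values, and the choice to accept an estimate when α is greater than or equal to the threshold (rather than strictly greater) are assumptions made here.

```python
ALPHA_THRESHOLD = 10.0  # value selected from the parameter sweep


def accept_shifts(estimates, threshold=ALPHA_THRESHOLD):
    """Keep each (dx, dy) whose alpha meets the threshold; otherwise record None.

    `estimates` is a list of ((dx, dy), alpha) pairs, one per image pair.
    """
    return [shift if alpha >= threshold else None for shift, alpha in estimates]


if __name__ == "__main__":
    estimates = [((200, 0), 35.2), ((17, -4), 3.1), ((198, 1), 10.0)]  # made-up values
    print(accept_shifts(estimates))
```

Rejected image pairs are kept as `None` so that downstream processing can see where no displacement was accepted.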


Figure 4.5: Alpha (α) Parameter Sweep For Y-Direction Test Setup.

4.2.2 Displacement Error Calculation. The primary metric for evaluating the
performance of the optical flow based registration algorithm is the absolute value
of the difference between the predicted displacement (ΔX_p, ΔY_p) and the truth
(ΔX_t, ΔY_t). This performance metric is calculated with Equations 3.19 and 3.20
for the X-direction and Y-direction respectively. The differences between the
displacements are shown in Figure 4.8 for the X-direction and in Figure 4.9 for the
Y-direction. Tables 4.1 and 4.2 present the statistics of the error distributions
for the X-direction and Y-direction respectively.


Table 4.1: Statistics for Difference Between Predicted X-Displacements and Truth
X-Displacements.

Parameter    Value
Mean         3.099 cm
Std Dev      6.733 cm

Observe that the differences in both the X-direction and the Y-direction hover
around their respective means, with a few outliers. The outlier values can be
explained by examining the average velocity of the run, shown in Figure 4.10. The
velocity steadily increases and stabilizes around 0.4 (in the units of Figure 4.10),
which translates to a consistent image overlap. Around time step 350 there is an
abrupt spike in the velocity, causing the image overlap to decrease significantly
compared to the rest of the test run. This spike in velocity translated directly to
the error in the X-direction and Y-direction.

Figure 4.6: Alpha (α) Parameter Sweep For Diagonal Test Setup.

Table 4.2: Statistics for Difference Between Predicted Y-Displacements and Truth
Y-Displacements.

Parameter    Value
Mean         3.454 cm
Std Dev      9.537 cm

To represent the multidimensional displacement, the Euclidean distance (l2
norm) between the predicted (ΔX_p, ΔY_p) and the truth (ΔX_t, ΔY_t) is calculated
with Equation 3.21. The differences between the displacements are shown in Figure
4.11, and Table 4.3 presents the corresponding error distribution statistics.

Figure 4.7: Alpha (α) Parameter Sweep Data.


Observe that, consistent with the individual direction assessments, the l2 norm
error hovers around the mean and has a large spike consistent with the spike in
velocity. Although this anomaly in the data is consistent with the spike in the
velocity, there are a number of other possible sources of error. For instance,
changes in camera distance from the ground caused by uneven surfaces will introduce
camera calibration error. Also notice the “gaps” in the data, where all data points
are zero. These gaps occur where no image correlation could be calculated, which may
be associated with the previously mentioned sources of error.

Table 4.3: Statistics for Multidimensional Displacement Differences.

Parameter    Value
Mean         1.347 cm
Std Dev      15.268 cm

Figure 4.8: Difference Between Predicted X-Displacements and Truth X-Displacements.


4.3 Summary

Chapter IV presented the hardware and testing parameters used to test the
validity of the optical flow based registration algorithm. Testing resulted in 2 cm
level accuracy with an α parameter setting of 10. Finally, recommendations for future
work and final remarks will conclude this thesis in Chapter V.
Figure 4.9: Difference Between Predicted Y-Displacements and Truth Y-Displacements.

Figure 4.10: Average Velocity Over Test Run

V. Conclusions


This  chapter  presents  the  conclusions  of  this  thesis  and  recommendations  for  future 
work.  Section  5.1  presents  an  evaluation  of  the  algorithm  developed.  Improvements 
for  the  navigation  system  and  recommendations  for  the  next  phase  in  research  are 
presented  in  Section  5.2. 

5.1  Conclusions 

Analysis of the optical flow based phase correlation algorithm began with
characterization of a single image, presented in Section 3.1.1. This technique
proved beneficial for initial refinement of the algorithm because truth data and
testing options were readily available and reliable. Next, the algorithm was tested
using real-world images from a single camera, presented in Section 3.1.2.

The concept of using image registration methods to conduct optical flow odometry
is a relatively new technique. Both the image registration algorithms analyzed
in [11] and the results produced in this research support the view that the phase
correlation method is “regarded as a robust estimator in the presence of noise” [12].

The  system  description  in  Chapter  IV  outlines  a  system  that  captures  images 
at  a  rate  of  10  Hz  while  the  vehicle  traverses  through  an  environment.  The  optical 
flow  odometry  algorithm  then  analyzes  the  translational  motion  between  images  to 
produce  the  displacement  in  vehicle  pose  over  time.  These  results  can  be  further 
extended  to  produce  pose  estimation. 

This research also found that phase correlation methods produce a displacement
estimate regardless of whether or not a specific shift was found between two
images. Therefore, an alpha parameter was established to detect such false
positives. Setting α = 10 yielded no false positives throughout the dataset. This
greatly improves the accuracy of the overall technique.

The application of the optical flow based image registration revealed that the
spatial pose displacement averaged only 1.3 cm in error throughout the data set.
This suggests that further extension to pose estimation would yield similar
accuracy, and indicates that this process is a viable solution for navigation
purposes. While the scope of this research was limited to optical flow odometry, it
can be further expanded and characterized for mapping capabilities.

5.2  Recommendations  for  Future  Research 

This section discusses areas for future work that would further advance the
navigation capability presented in this thesis. Investigating the algorithm's
response to differing terrain, varying image capture rates, and limited lighting
conditions that replicate what will be encountered in an underground tunnel
setting will further develop the optical flow based registration algorithm for
implementation in an autonomous mobile robot.

Errors associated with changes in vehicle velocity may be reduced or eliminated
by using a variable image capture rate that directly correlates with the velocity
of the vehicle. The ability to dynamically adjust the image capture rate based on
the velocity may significantly decrease the amount of data storage needed and
increase accuracy. For example, if the vehicle is traversing a long straight tunnel
at a constant velocity, a capture rate of 5 Hz may be adequate for accurate
cross-correlation of the images, but if the vehicle encounters a downhill and its
velocity increases, a capture rate of 10 Hz may be needed to accurately calculate
displacement [34].
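
One way to reason about a velocity-dependent capture rate is to require a minimum overlap between consecutive images. The Python sketch below is a back-of-the-envelope illustration, not a method from the thesis: the function name, the ground footprint length, and the overlap fraction are all hypothetical parameters.

```python
def required_capture_rate(speed, footprint, min_overlap):
    """Frames per second needed so consecutive images overlap by `min_overlap`.

    speed       -- vehicle speed along the direction of travel (m/s)
    footprint   -- length of ground covered by one image along that direction (m)
    min_overlap -- required shared fraction between consecutive images, in [0, 1)
    """
    if not 0.0 <= min_overlap < 1.0:
        raise ValueError("min_overlap must be in [0, 1)")
    max_travel_per_frame = footprint * (1.0 - min_overlap)
    return speed / max_travel_per_frame


if __name__ == "__main__":
    # Hypothetical numbers: 0.2 m footprint, 50 percent overlap required.
    for speed in (0.2, 0.4, 0.8):
        print(speed, required_capture_rate(speed, 0.2, 0.5))
```

Under this model the required rate scales linearly with speed, so doubling the vehicle speed doubles the capture rate needed to hold the same overlap.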

Another area for future research, image exposure, may significantly increase
algorithm functionality. Normalizing the image exposure for each image capture by
synchronizing a strobe light to the frame rate may minimize the discrepancies
between images that do not correlate with actual displacement [34].

Another area for future research is a solution for the uneven surfaces
encountered in the underground tunnel environment. Mounting laser range finders in
parallel with each of the cameras could make it possible to determine the roll,
pitch and yaw of the cameras more accurately, which could in turn allow camera
calibration adjustments on the fly to account for the differences. Camera
calibration on the fly could potentially produce more accurate image capture for
increased displacement accuracy [34].

To demonstrate the true viability of the optical flow based registration algorithm
presented in this thesis, it should be integrated into a Kalman filter to resolve
the actual path traversed and to develop a map. Integration into a Kalman filter
would make it possible to reconstruct the actual path the vehicle traversed, with
real-time image processing updating the IMU for an accurate position solution.

The ultimate goal of this research is for this navigation technique to be
implemented in a full scale rover that is able to accurately and autonomously
navigate through an underground tunnel-like environment. This capability meets the
immediate needs of users such as U.S. Homeland Security and our military.